Trying out an Amazon Bedrock knowledge base with an Amazon Kendra GenAI index #AWSreInvent

AWS re:Invent 2024
#Amazon Kendra
#Amazon Bedrock Knowledge Bases
#Amazon Bedrock
#AWS
Takakuni
2024.12.09
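Hello! This is Takakuni (@takakuni_) from the Consulting Department, AWS Business Division.
re:Invent 2024 was over in the blink of an eye. Starting this week, I'm moving into hands-on mode.
In Dr. Swami's keynote, the Amazon Kendra Generative AI Index was announced.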
https://aws.amazon.com/blogs/machine-learning/introducing-amazon-kendra-genai-index-enhanced-semantic-search-and-retrieval-capabilities/
https://aws.amazon.com/about-aws/whats-new/2024/12/genai-index-amazon-kendra/
"Kendra or Knowledge Bases, which should I choose?" is a common topic of debate, and unexpectedly the two have now been combined. Quite a surprise.
https://dev.classmethod.jp/articles/amazon-kendra-genai-index-ga/
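Note that Dr. Swami's slides describe it as a preview, but from what I could find it appears to be generally available.

Amazon Kendra Generative AI Index

Amazon Kendra Generative AI Index is an index tailored for RAG that supports hybrid search, combining vector search and keyword search. Semantic embedding and re-ranking are also built in as managed features.
The index integrates seamlessly with Amazon Bedrock Knowledge Bases and Amazon Q Business. With the other editions, you had to implement the logic for querying Kendra and generating an answer on the application side, so being able to hand that off to AWS-side features is welcome.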
https://docs.aws.amazon.com/kendra/latest/dg/hiw-index-types.html#kendra-gen-ai-index
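
Features provided

For RAG you will mostly use the Retrieve API, so I'll focus on Retrieve here. The following features are fully or partially supported.

Fully supported
- Answer confidence as determined by Kendra
- Filtering and faceted search
- Search relevance tuning
- Custom Document Enrichment
- Custom Metadata
- Adding capacity

Partially supported
- Data source connectors
  - Only v2.0 connectors are supported
- Filtering by user context
  - ACLs cannot be used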

https://docs.aws.amazon.com/kendra/latest/dg/hiw-index-types.html#kendra-gen-ai-index-features
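
Since Retrieve is the API that matters for RAG, here is a minimal sketch of calling it directly with boto3 and pulling out Kendra's confidence label for each passage. The helper name `retrieve_passages` and the flattened output shape are my own choices for illustration, and I did not run this against the index in this article, so treat it as a starting point rather than a verified implementation.

```python
from __future__ import annotations

from typing import Any


def retrieve_passages(kendra_client: Any, index_id: str, query: str, page_size: int = 5) -> list[dict]:
    """Call the Kendra Retrieve API and flatten each passage into a small dict.

    `kendra_client` is expected to be a boto3 Kendra client, for example
    boto3.client("kendra", region_name="us-east-1").
    """
    response = kendra_client.retrieve(IndexId=index_id, QueryText=query, PageSize=page_size)
    passages = []
    for item in response.get("ResultItems", []):
        passages.append(
            {
                "title": item.get("DocumentTitle", ""),
                "uri": item.get("DocumentURI", ""),
                "content": item.get("Content", ""),
                # Fall back to NOT_AVAILABLE when a passage carries no score attributes.
                "confidence": item.get("ScoreAttributes", {}).get("ScoreConfidence", "NOT_AVAILABLE"),
            }
        )
    return passages


# Usage (requires AWS credentials; the index ID below is a placeholder):
#   import boto3
#   kendra = boto3.client("kendra", region_name="us-east-1")
#   for p in retrieve_passages(kendra, "<your-index-id>", "What are instance families?"):
#       print(p["confidence"], p["title"])
```

Passing the client in as an argument keeps the function easy to exercise with a stub, which is handy given how long a real index takes to create.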
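
Limitations

At the time of writing, the index supports English content only. The following limitations also apply.

- Available only in the US East (N. Virginia) and US West (Oregon) Regions
- Only Amazon Kendra data source connectors v2.0 are supported
- Searching via ACLs or user context information is not available

For notes on Amazon Q Business and the latest information, see the following.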
https://docs.aws.amazon.com/kendra/latest/dg/hiw-index-types.html#genai-index-limitations
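
But isn't it expensive?

The Amazon Kendra Generative AI Index has a different pricing structure from the other editions.
It doesn't go as far as per-API-call billing, but the prices for the Base Index, Storage Units, and Query Units have been kept considerably lower.
It would be even nicer if a Developer Edition of the Amazon Kendra Generative AI Index appeared in the future.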
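
Trying it out

Creating resources

Now let's create an Amazon Kendra Generative AI Index. This time I'll use it as the retriever for Amazon Bedrock Knowledge Bases.
I'll create the Kendra Generative AI Index from the knowledge base side.
Kendra GenAI Index shows up on the knowledge base creation screen. You can choose whether to use an existing index or create a new one.
I'll create a new one.
Once the IAM role had been created, the screen moved on to creating the knowledge base. The popup says it takes about 20 minutes.
After about 30 minutes, the Kendra GenAI Index was ready.
The knowledge base status shows "Ready to add data source in Kendra".
Let's look at the Kendra side as well. The Edition field shows GenAI.
Access control is also set to No, in line with the limitations.
This time I'll select Sample AWS documentation as the Data Source.
Syncing and indexing also took a while, around 15 minutes I think, but it completed successfully.
The knowledge base side also recognizes it as a data source.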
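
I did everything above from the console, but the same resources can presumably be created through the API. The sketch below builds the request parameters for `kendra.create_index` and `bedrock-agent.create_knowledge_base`, plus a simple polling helper, since index creation took around 30 minutes in my case. The edition value `GEN_AI_ENTERPRISE_EDITION` and the `KENDRA` knowledge base type are taken from my understanding of the API references, not from anything I executed for this article, and the helper names are hypothetical.

```python
import time
from typing import Any


def kendra_index_request(name: str, role_arn: str) -> dict:
    """Parameters for kendra.create_index targeting the GenAI edition (assumed enum value)."""
    return {
        "Name": name,
        "Edition": "GEN_AI_ENTERPRISE_EDITION",
        "RoleArn": role_arn,
    }


def knowledge_base_request(name: str, role_arn: str, kendra_index_arn: str) -> dict:
    """Parameters for bedrock-agent.create_knowledge_base backed by an existing Kendra index."""
    return {
        "name": name,
        "roleArn": role_arn,
        "knowledgeBaseConfiguration": {
            "type": "KENDRA",
            "kendraKnowledgeBaseConfiguration": {"kendraIndexArn": kendra_index_arn},
        },
    }


def wait_until_active(kendra_client: Any, index_id: str, poll_seconds: float = 60.0, max_polls: int = 60) -> bool:
    """Poll DescribeIndex until the index is ACTIVE; return False on FAILED or timeout."""
    for _ in range(max_polls):
        status = kendra_client.describe_index(Id=index_id)["Status"]
        if status == "ACTIVE":
            return True
        if status == "FAILED":
            return False
        time.sleep(poll_seconds)
    return False


# Usage (requires AWS credentials; role ARNs and names are placeholders):
#   import boto3
#   kendra = boto3.client("kendra", region_name="us-east-1")
#   index_id = kendra.create_index(**kendra_index_request("my-genai-index", "<kendra-role-arn>"))["Id"]
#   if wait_until_active(kendra, index_id):
#       agent = boto3.client("bedrock-agent", region_name="us-east-1")
#       agent.create_knowledge_base(**knowledge_base_request("my-kb", "<kb-role-arn>", "<kendra-index-arn>"))
```

Data sources still have to be added and synced on the Kendra side afterwards, just as in the console flow.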
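
Search

Now let's search through the knowledge base.
There are a few characteristics worth noting, such as only hybrid search being supported.
Features the knowledge base has offered all along, such as filters, guardrails, and query decomposition, were also available.
I asked a question, and an appropriate answer came back.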
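
I asked the question from the console, but for reference, the same kind of query can be sketched with the `RetrieveAndGenerate` API of `bedrock-agent-runtime`. The function name `ask_knowledge_base` and the way citations are flattened are my own illustration; the knowledge base ID is a placeholder, and I have not confirmed that every console option maps onto this call.

```python
from __future__ import annotations

from typing import Any


def ask_knowledge_base(runtime_client: Any, knowledge_base_id: str, model_arn: str, question: str) -> tuple[str, list[dict]]:
    """Ask a question through the knowledge base and return (answer text, source locations)."""
    response = runtime_client.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    )
    answer = response["output"]["text"]
    # Each citation may reference several retrieved chunks; collect their locations.
    sources = []
    for citation in response.get("citations", []):
        for reference in citation.get("retrievedReferences", []):
            sources.append(reference.get("location", {}))
    return answer, sources


# Usage (requires AWS credentials; the knowledge base ID is a placeholder):
#   import boto3
#   runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")
#   answer, sources = ask_knowledge_base(
#       runtime,
#       "<knowledge-base-id>",
#       "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0",
#       "What is the difference between instance types and instance families?",
#   )
```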
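The Kendra-specific x-amz-kendra-score-confidence is also included in the metadata. Searching based on this metadata looks like a good idea too.
Although the index supports English content only, asking in Japanese also seems to return a correct reply.
I'm looking forward to Japanese support.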
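
One simple way to make use of x-amz-kendra-score-confidence is to filter retrieved chunks on the client side after calling the knowledge base `Retrieve` API. Note that this is post-filtering in application code, not a server-side metadata filter. The sketch assumes the metadata value uses the same labels as Kendra's ScoreConfidence (VERY_HIGH, HIGH, MEDIUM, LOW, NOT_AVAILABLE); I have not verified the exact format the knowledge base returns, so adjust the ordering table if it differs.

```python
from __future__ import annotations

# Assumed ordering of confidence labels, from weakest to strongest.
CONFIDENCE_ORDER = ["NOT_AVAILABLE", "LOW", "MEDIUM", "HIGH", "VERY_HIGH"]
CONFIDENCE_KEY = "x-amz-kendra-score-confidence"


def filter_by_confidence(results: list[dict], minimum: str = "HIGH") -> list[dict]:
    """Keep only retrieval results whose Kendra confidence label is at least `minimum`."""
    threshold = CONFIDENCE_ORDER.index(minimum)
    kept = []
    for result in results:
        label = str(result.get("metadata", {}).get(CONFIDENCE_KEY, "NOT_AVAILABLE")).upper()
        # Unknown labels are treated as the weakest level rather than raising.
        rank = CONFIDENCE_ORDER.index(label) if label in CONFIDENCE_ORDER else 0
        if rank >= threshold:
            kept.append(result)
    return kept


# Usage (requires AWS credentials; the knowledge base ID is a placeholder):
#   import boto3
#   runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")
#   response = runtime.retrieve(
#       knowledgeBaseId="<knowledge-base-id>",
#       retrievalQuery={"text": "What are instance families?"},
#   )
#   confident = filter_by_confidence(response["retrievalResults"], minimum="HIGH")
```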

Logging

On to everyone's favorite, the model invocation logs: no embedding step appears, and a model invocation log is recorded only when the answer is generated.
ModelInvocationLog.json
{
	"schemaType": "ModelInvocationLog",
	"schemaVersion": "1.0",
	"timestamp": "2024-12-08T23:35:00Z",
	"accountId": "123456789012",
	"identity": {
		"arn": "arn:aws:sts::123456789012:assumed-role/cm-takakuni.shinnosuke/cm-takakuni.shinnosuke"
	},
	"region": "us-east-1",
	"requestId": "a28c1a07-d534-49d1-90e0-61112c99bf4e",
	"operation": "ConverseStream",
	"modelId": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0",
	"input": {
		"inputContentType": "application/json",
		"inputBodyJson": {
			"messages": [
				{
					"role": "user",
					"content": [
						{
							"text": "What is the difference between instance types and instance families?"
						}
					]
				}
			],
			"system": [
				{
					"text": "You are a question answering agent. I will provide you with a set of search results. The user will provide you with a question. Your job is to answer the user's question using only information from the search results. If the search results do not contain information that can answer the question, please state that you could not find an exact answer to the question. Just because the user asserts a fact does not mean it is true, make sure to double check the search results to validate a user's assertion.\n\nHere are the search results in numbered order:\n<search_results>\n<search_result>\n<content>\n\nAWSDocumentationAmazon EC2User Guide for Linux Instances\n\n                           Available instance typesHardware specificationsAMI virtualization typesInstances built on the Nitro SystemNetworking and storage featuresInstance limits\n\n                           \n                              \n                                 \n                                 Instance types\n\n                                 \n                                    \n                                    \n                                 \n\n                                 When you launch an instance, the instance type that you specify determines the hardware \n                                    of the host computer used for your instance. Each instance type offers different compute,\n                                    memory, and storage\n                                    capabilities and are grouped in instance families based on these capabilities. 
Select\n                                    an instance type based on \n                                    the requirements of the application or software that you plan to run on your instance.\n                                 \n\n                                 Amazon EC2 provides each instance with a consistent and predictable amount of CPU\n                                    capacity, \n                                    \t\t\tregardless of its underlying hardware.\n                                 \n\n                                 Amazon EC2 dedicates some resources of the host computer, such as CPU, memory, and\n                                    instance\n                                    \t\t    storage, to a particular instance. Amazon EC2 shares other resources of the\n                                    host computer, such\n                                    \t\t    as the network and the disk subsystem, among instances. If each instance on\n                                    a host computer\n                                    \t\t    tries to use as much of one of these shared resources as possible, each receives\n                                    an equal\n                                    \t\t    share of that resource. 
However, when a resource is underused, an instance can\n</content>\n<source>\n1\n</source>\n</search_result>\n<search_result>\n<content>\n\nAWSDocumentationAmazon EC2User Guide for Windows Instances\n\n                           Available instance typesHardware specificationsInstances built on the Nitro SystemNetworking and storage featuresInstance limits\n\n                           \n                              \n                                 \n                                 Instance types\n\n                                 \n                                    \n                                    \n                                 \n\n                                 When you launch an instance, the instance type that you specify determines the hardware \n                                    of the host computer used for your instance. Each instance type offers different compute,\n                                    memory, and storage\n                                    capabilities and are grouped in instance families based on these capabilities. Select\n                                    an instance type based on \n                                    the requirements of the application or software that you plan to run on your instance.\n                                 \n\n                                 Amazon EC2 provides each instance with a consistent and predictable amount of CPU\n                                    capacity, \n                                    \t\t\tregardless of its underlying hardware.\n                                 \n\n                                 Amazon EC2 dedicates some resources of the host computer, such as CPU, memory, and\n                                    instance\n                                    \t\t    storage, to a particular instance. 
Amazon EC2 shares other resources of the\n                                    host computer, such\n                                    \t\t    as the network and the disk subsystem, among instances. If each instance on\n                                    a host computer\n                                    \t\t    tries to use as much of one of these shared resources as possible, each receives\n                                    an equal\n                                    \t\t    share of that resource. However, when a resource is underused, an instance can\n</content>\n<source>\n2\n</source>\n</search_result>\n<search_result>\n<content>\nTypes of Reserved Instances (offering classes)\nTypes of Reserved Instances (offering classes) When you purchase a Reserved Instance, you can choose between a Standard or Convertible offering class. The Reserved Instance applies to a single instance type, platform, scope, and tenancy over a term. If your computing needs change, you may be able to modify or exchange your Reserved Instance, depending on the offering class. Offering classes may also have additional restrictions or limitations. The following are the differences between Standard and Convertible offering classes.\nStandard Reserved Instance\tConvertible Reserved Instance\nSome attributes, such as instance size, can be modified during the term; however, the instance family cannot be modified. You cannot exchange a Standard Reserved Instance, only modify it. For more information, see Modifying Reserved Instances.\tCan be exchanged during the term for another Convertible Reserved Instance with new attributes including instance family, instance type, platform, scope, or tenancy. For more information, see Exchanging Convertible Reserved Instances. You can also modify some attributes of a Convertible Reserved Instance. 
For more information, see Modifying Reserved Instances.\nCan be sold in the Reserved Instance Marketplace.\tCannot be sold in the Reserved Instance Marketplace.\n</content>\n<source>\n3\n</source>\n</search_result>\n<search_result>\n<content>\nTypes of Reserved Instances (offering classes)\nTypes of Reserved Instances (offering classes) When you purchase a Reserved Instance, you can choose between a Standard or Convertible offering class. The Reserved Instance applies to a single instance type, platform, scope, and tenancy over a term. If your computing needs change, you may be able to modify or exchange your Reserved Instance, depending on the offering class. Offering classes may also have additional restrictions or limitations. The following are the differences between Standard and Convertible offering classes.\nStandard Reserved Instance\tConvertible Reserved Instance\nSome attributes, such as instance size, can be modified during the term; however, the instance family cannot be modified. You cannot exchange a Standard Reserved Instance, only modify it. For more information, see Modifying Reserved Instances.\tCan be exchanged during the term for another Convertible Reserved Instance with new attributes including instance family, instance type, platform, scope, or tenancy. For more information, see Exchanging Convertible Reserved Instances. You can also modify some attributes of a Convertible Reserved Instance. 
For more information, see Modifying Reserved Instances.\nCan be sold in the Reserved Instance Marketplace.\tCannot be sold in the Reserved Instance Marketplace.\n</content>\n<source>\n4\n</source>\n</search_result>\n<search_result>\n<content>\n\nconsume a\n                                    \t\t    higher share of that resource while it's available.\n                                 \n\n                                 Each instance type provides higher or lower minimum performance from a shared resource.\n                                    \t\t\tFor example, instance types with high I/O performance have a larger allocation\n                                    of shared resources. \n                                    \t\t\tAllocating a larger share of shared resources also reduces the variance of I/O\n                                    performance. \n                                    \t\t\tFor most applications, moderate I/O performance is more than enough. However, for\n                                    \t\t\tapplications that require greater or more consistent I/O performance, consider\n                                    \t\t\tan instance type with higher I/O performance.\n                                 \n\n                                 \n                                    Contents\n\n                                    \tAvailable instance types\n\tHardware specifications\n\tInstances built on the Nitro System\n\tNetworking and storage features\n\tInstance limits\n\tGeneral purpose instances\n\tCompute optimized instances\n\tMemory optimized instances\n\tStorage optimized instances\n\tWindows accelerated computing\n                                             \t\t\tinstances\n\tFinding an Amazon EC2 instance type\n\tChanging the instance type\n\tGetting recommendations for an instance type\n\n\n                                 \n\n                                 \t\t\t\n                                 Available instance types\n\n                 
                \n                                 \t\t    \n                                 Amazon EC2 provides a wide selection of instance types optimized for different use\n                                    cases. \n                                    \t\t        For the best performance, we recommend that you use the following current\n                                    generation \n                                    \t\t        instance types when you launch new instances. For more information about\n                                    the current \n                                    \t\t        generation instance types, see Amazon EC2 Instance Types.\n                                 \n\n                                 \t\t    \n                                 Amazon EC2 provides the instance types in the following table. To \n                                    \t\t        determine which instance types meet your requirements, such as supported\n                                    Regions, \n                                    \t\t        compute resources, or storage resources, see Finding an Amazon EC2 instance type.\n</content>\n<source>\n5\n</source>\n</search_result>\n\n</search_results>\n\nYou should provide your answer without any inline citations or references to specific sources within the answer text itself. Do not include phrases like \"according to source X\", \"[1]\", \"[source 2, 3]\", etc within your <text> tags.\n\nHowever, you should include <sources> tags at the end of each <answer_part> to specify which source(s) the information came from.\nNote that <sources> may contain multiple <source> if you include information from multiple results in your answer.\n\nDo NOT directly quote the <search_results> in your answer. Your job is to answer the user's question as concisely as possible.\n\nYou must output your answer in the following format. 
Pay attention and follow the formatting and spacing exactly:\n<answer>\n<answer_part>\n<text>\nfirst answer text\n</text>\n<sources>\n<source>source ID</source>\n</sources>\n</answer_part>\n<answer_part>\n<text>\nsecond answer text\n</text>\n<sources>\n<source>source ID</source>\n</sources>\n</answer_part>\n</answer>\n"
				}
			],
			"inferenceConfig": {
				"maxTokens": 2048,
				"temperature": 0,
				"topP": 1,
				"stopSequences": ["\nObservation"]
			}
		},
		"inputTokenCount": 2258
	},
	"output": {
		"outputContentType": "application/json",
		"outputBodyJson": {
			"output": {
				"message": {
					"role": "assistant",
					"content": [
						{
							"text": "<answer>\n<answer_part>\n<text>\nInstance types and instance families are related but distinct concepts in Amazon EC2:\n\nInstance types are specific configurations of compute, memory, and storage capabilities. When you launch an instance, you select a particular instance type based on the requirements of the application or software you plan to run.\n\nInstance families, on the other hand, are groups of instance types that share similar characteristics and are optimized for specific use cases. Instance types within a family will have varying levels of resources but are designed for similar purposes.\n</text>\n<sources>\n<source>1</source>\n<source>2</source>\n<source>5</source>\n</sources>\n</answer_part>\n<answer_part>\n<text>\nFor example, there are instance families optimized for general purpose computing, compute-intensive tasks, memory-intensive applications, storage optimization, and accelerated computing. When choosing an instance, you would first consider which family best suits your needs, then select a specific instance type within that family that provides the right balance of resources for your application.\n</text>\n<sources>\n<source>5</source>\n</sources>\n</answer_part>\n</answer>"
						}
					]
				}
			},
			"stopReason": "end_turn",
			"metrics": {
				"latencyMs": 7372
			},
			"usage": {
				"inputTokens": 2258,
				"outputTokens": 268,
				"totalTokens": 2526
			}
		},
		"outputTokenCount": 268
	}
}
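
To check the same thing across many log records, a small summarizer along these lines can help. It reads only fields that appear in the record above (`operation`, `modelId`, `inputTokenCount`, `outputTokenCount`, `usage.totalTokens`). The `is_embedding` flag is a rough heuristic of mine that assumes embedding model IDs contain the string "embed", which may not hold for every model.

```python
import json


def summarize_invocation(record: dict) -> dict:
    """Extract the operation, model, and token counts from one ModelInvocationLog record."""
    input_part = record.get("input", {})
    output_part = record.get("output", {})
    body = output_part.get("outputBodyJson", {})
    # Guard against bodies that are not a JSON object.
    usage = body.get("usage", {}) if isinstance(body, dict) else {}
    model = record.get("modelId", "").rsplit("/", 1)[-1]
    return {
        "operation": record.get("operation"),
        "model": model,
        "input_tokens": input_part.get("inputTokenCount"),
        "output_tokens": output_part.get("outputTokenCount"),
        "total_tokens": usage.get("totalTokens"),
        # Heuristic only: assumes embedding model IDs contain "embed".
        "is_embedding": "embed" in model.lower(),
    }


def summarize_file(path: str) -> dict:
    """Load a single-record JSON file such as ModelInvocationLog.json and summarize it."""
    with open(path, encoding="utf-8") as f:
        return summarize_invocation(json.load(f))
```

For the record above, the summary would report the ConverseStream operation on the Claude 3.5 Sonnet model with 2258 input tokens and 268 output tokens, and no embedding call.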

IAM

For reference, here are the policies attached to the automatically created IAM roles.
knowledge-bases.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AmazonBedrockKnowledgeBaseKendraIndexAccessStatement",
            "Effect": "Allow",
            "Action": [
                "kendra:Retrieve",
                "kendra:DescribeIndex"
            ],
            "Resource": [
                "arn:aws:kendra:us-east-1:123456789012:index/7bf47ec6-461c-4497-91b1-70fff32a1c9d"
            ]
        }
    ]
}
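
If you want to create the knowledge base role yourself instead of relying on the console, the policy above can be generated from a few parameters. This is only a sketch that reproduces the statement shown in knowledge-bases.json; the function name is hypothetical, and the trust policy for the role is not covered here.

```python
import json


def knowledge_base_kendra_policy(region: str, account_id: str, index_id: str) -> dict:
    """Build the Kendra access policy for a knowledge base role, mirroring knowledge-bases.json."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AmazonBedrockKnowledgeBaseKendraIndexAccessStatement",
                "Effect": "Allow",
                "Action": ["kendra:Retrieve", "kendra:DescribeIndex"],
                "Resource": [f"arn:aws:kendra:{region}:{account_id}:index/{index_id}"],
            }
        ],
    }


# Example: regenerate the document shown above as a JSON string.
policy_json = json.dumps(
    knowledge_base_kendra_policy("us-east-1", "123456789012", "7bf47ec6-461c-4497-91b1-70fff32a1c9d"),
    indent=4,
)
```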
kendra.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricData"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "cloudwatch:namespace": "AWS/Kendra"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:DescribeLogGroups"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup"
            ],
            "Resource": [
                "arn:aws:logs:us-east-1:123456789012:log-group:/aws/kendra/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:DescribeLogStreams",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:us-east-1:123456789012:log-group:/aws/kendra/*:log-stream:*"
            ]
        }
    ]
}
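
Summary

That was "Trying out an Amazon Bedrock knowledge base with an Amazon Kendra GenAI index".
If you don't need multimodal RAG, or you want to build without having to think about chunking strategies and the like, my impression is that this is a surprisingly effective option.
Apart from indexable content being limited to English, I didn't find anything that raises the bar for adoption.
I hope this blog is helpful to someone.
This was Takakuni (@takakuni_) from the Consulting Department, AWS Business Division!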